knowledge system
What AI doesn't know: we could be creating a global 'knowledge collapse'
Deepak Varuvel Dennison

As GenAI becomes the primary way to find information, local and traditional wisdom is being lost. And we are only beginning to realise what we're missing. This article was originally published as 'Holes in the web' on Aeon.co.

A few years back, my dad was diagnosed with a tumour on his tongue, which meant we had some choices to weigh up. My family has an interesting dynamic when it comes to medical decisions. While my older sister is a trained doctor in western allopathic medicine, my parents are big believers in traditional remedies.

Having grown up in a small town in India, I am accustomed to rituals. My dad had a ritual, too. Every time we visited his home village in southern Tamil Nadu, he'd get a bottle of thick, pungent, herb-infused oil from a vaithiyar, a traditional doctor practising Siddha medicine. It was his way of maintaining his connection with the kind of medicine he had always known and trusted.
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains
Devane, Vijay, Nauman, Mohd, Patel, Bhargav, Wakchoure, Aniket Mahendra, Sant, Yogeshkumar, Pawar, Shyam, Thakur, Viraj, Godse, Ananya, Patra, Sunil, Maurya, Neha, Racha, Suraj, Singh, Nitish Kamal, Nagpal, Ajay, Sawarkar, Piyush, Pundalik, Kundeshwar Vijayrao, Saluja, Rohit, Ramakrishnan, Ganesh
The rapid advancement of large language models (LLMs) has intensified the need for domain- and culture-specific evaluation. Existing benchmarks are largely Anglocentric and domain-agnostic, limiting their applicability to India-centric contexts. To address this gap, we introduce BhashaBench V1, the first domain-specific, multi-task, bilingual benchmark focusing on critical Indic knowledge systems. BhashaBench V1 contains 74,166 meticulously curated question-answer pairs, with 52,494 in English and 21,672 in Hindi, sourced from authentic government and domain-specific exams. It spans four major domains: Agriculture, Legal, Finance, and Ayurveda, comprising 90+ subdomains and covering 500+ topics, enabling fine-grained evaluation. Evaluation of 29+ LLMs reveals significant domain- and language-specific performance gaps, with especially large disparities in low-resource domains. For instance, GPT-4o achieves 76.49% overall accuracy in Legal but only 59.74% in Ayurveda. Models consistently perform better on English content than on Hindi across all domains. Subdomain-level analysis shows that areas such as Cyber Law and International Finance perform relatively well, while Panchakarma, Seed Science, and Human Rights remain notably weak. BhashaBench V1 provides a comprehensive dataset for evaluating large language models across India's diverse knowledge domains. It enables assessment of models' ability to integrate domain-specific knowledge with bilingual understanding. All code, benchmarks, and resources are publicly available to support open research.
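The headline numbers above (per-domain, per-language accuracy gaps such as Legal 76.49% vs Ayurveda 59.74%) come down to slicing graded answers by domain and language. A minimal sketch of that metric follows; the record fields ("domain", "language", "correct") are assumptions for illustration, not BhashaBench V1's actual schema.

```python
from collections import defaultdict

def accuracy_by_slice(records):
    """Accuracy per (domain, language) slice.

    records: iterable of dicts with 'domain', 'language', and boolean
    'correct' keys (assumed field names, for illustration only).
    """
    totals = defaultdict(lambda: [0, 0])  # (domain, language) -> [hits, count]
    for r in records:
        key = (r["domain"], r["language"])
        totals[key][0] += int(r["correct"])
        totals[key][1] += 1
    return {key: hits / count for key, (hits, count) in totals.items()}

# Toy graded results, standing in for a model's answers on the benchmark.
results = [
    {"domain": "Legal", "language": "en", "correct": True},
    {"domain": "Legal", "language": "en", "correct": False},
    {"domain": "Ayurveda", "language": "hi", "correct": False},
]
scores = accuracy_by_slice(results)
```

Grouping by both keys at once is what exposes the compounded gap the abstract highlights: a low-resource domain evaluated in Hindi sits at the intersection of two weaknesses.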
Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness
Chen, Sirui, Tian, Changxin, Hu, Binbin, Chen, Kunlong, Liu, Ziqi, Zhang, Zhiqiang, Zhou, Jun
Enhancing the mathematical reasoning of large language models (LLMs) demands high-quality training data, yet conventional methods face critical challenges in scalability, cost, and data reliability. To address these limitations, we propose a novel program-assisted synthesis framework that systematically generates a high-quality mathematical corpus with guaranteed diversity, complexity, and correctness. This framework integrates mathematical knowledge systems and domain-specific tools to create executable programs. These programs are then translated into natural language problem-solution pairs and vetted by a bilateral validation mechanism that verifies solution correctness against program outputs and ensures program-problem consistency. We have generated 12.3 million such problem-solving triples. Experiments demonstrate that models fine-tuned on our data significantly improve their inference capabilities, achieving state-of-the-art performance on several benchmark datasets and showcasing the effectiveness of our synthesis approach.
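The bilateral validation idea in the abstract above can be sketched in a few lines: a candidate problem-solution pair is kept only if the written solution's final answer matches the output of the program that generated it. The generator and renderer here are toy stand-ins, not the paper's actual components.

```python
def make_program(a, b):
    """An executable 'program' whose output is the ground-truth answer."""
    return lambda: a * b + a

def render_problem(a, b):
    """Translate the program into a natural-language problem plus the
    solver's claimed final answer (toy stand-in for the translation step)."""
    text = f"Compute {a} * {b} + {a}."
    claimed_answer = a * b + a
    return text, claimed_answer

def bilateral_validate(program, claimed_answer):
    """Keep the pair only if the program's output confirms the solution."""
    return program() == claimed_answer

program = make_program(3, 4)
problem_text, answer = render_problem(3, 4)
valid = bilateral_validate(program, answer)
```

The point of checking both directions is that an error in either the translation or the solution causes a mismatch, so only consistent program-problem-solution triples survive into the corpus.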
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
Qiao, Runqi, Tan, Qiuna, Yang, Peiqing, Wang, Yanzi, Wang, Xiaowan, Wan, Enhui, Zhou, Sitong, Dong, Guanting, Zeng, Yuchen, Xu, Yida, Wang, Jie, Sun, Chong, Li, Chen, Zhang, Honggang
Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities across various tasks, but still struggle with complex mathematical reasoning. Existing research primarily focuses on dataset construction and method optimization, often overlooking two critical aspects: comprehensive knowledge-driven design and model-centric data space modeling. In this paper, we introduce We-Math 2.0, a unified system that integrates a structured mathematical knowledge system, model-centric data space modeling, and a reinforcement learning (RL)-based training paradigm to comprehensively enhance the mathematical reasoning abilities of MLLMs. The key contributions of We-Math 2.0 are fourfold: (1) MathBook Knowledge System: We construct a five-level hierarchical system encompassing 491 knowledge points and 1,819 fundamental principles. (2) MathBook-Standard & Pro: We develop MathBook-Standard, a dataset that ensures broad conceptual coverage and flexibility through dual expansion. Additionally, we define a three-dimensional difficulty space and generate 7 progressive variants per problem to build MathBook-Pro, a challenging dataset for robust training. (3) MathBook-RL: We propose a two-stage RL framework comprising: (i) Cold-Start Fine-tuning, which aligns the model with knowledge-oriented chain-of-thought reasoning; and (ii) Progressive Alignment RL, leveraging average-reward learning and dynamic data scheduling to achieve progressive alignment across difficulty levels. (4) MathBookEval: We introduce a comprehensive benchmark covering all 491 knowledge points with diverse reasoning step distributions. Experimental results show that MathBook-RL performs competitively with existing baselines on four widely-used benchmarks and achieves strong results on MathBookEval, suggesting promising generalization in mathematical reasoning.
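The "7 progressive variants per problem" in the abstract above follows from a three-dimensional difficulty space: with two settings per axis there are 2^3 = 8 configurations, and removing the seed's own configuration leaves 7 variants. A sketch under assumed axis names (the paper's actual dimensions may differ):

```python
from itertools import product

# Assumed difficulty axes, two settings each; names are illustrative.
AXES = {"steps": [1, 2], "visual": [0, 1], "knowledge": [0, 1]}
SEED_CFG = {"steps": 1, "visual": 0, "knowledge": 0}  # the easiest setting

def variants(seed_id):
    """Enumerate every difficulty configuration except the seed's own."""
    out = []
    for combo in product(*AXES.values()):
        cfg = dict(zip(AXES, combo))
        if cfg != SEED_CFG:
            out.append((seed_id, cfg))
    return out

v = variants("prob-001")  # 8 configurations minus the seed = 7 variants
```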
AI Thinking as a Meaning-Centered Framework: Reimagining Language Technologies Through Community Agency
While language technologies have advanced significantly, current approaches fail to address the complex sociocultural dimensions of linguistic preservation. AI Thinking proposes a meaning-centered framework that would transform technological development from creating tools FOR communities to co-creating solutions WITH them. This approach recognizes that meaningful solutions emerge through the interplay of cultural understanding, community agency, and technological innovation. The proposal articulates a holistic methodology and a five-layer technological ecosystem where communities maintain control over their linguistic and cultural knowledge representation. This systematic integration of community needs, cultural preservation, and advanced capabilities could revolutionize how we approach linguistic diversity preservation in the digital age.
Designing an LLM-Based Copilot for Manufacturing Equipment Selection
Werheid, Jonas, Melnychuk, Oleksandr, Zhou, Hans, Huber, Meike, Rippe, Christoph, Joosten, Dominik, Keskin, Zozan, Wittstamm, Max, Subramani, Sathya, Drescher, Benny, Göppert, Amon, Abdelrazeq, Anas, Schmitt, Robert H.
Effective decision-making in automation equipment selection is critical for reducing ramp-up time and maintaining production quality, especially in the face of increasing product variation and market demands. However, limited expertise and resource constraints often result in inefficiencies during the ramp-up phase, when new products are integrated into production lines. Existing methods often lack structured and tailored solutions to support automation engineers in reducing ramp-up time, leading to compromises in quality. This research investigates whether large language models (LLMs), combined with Retrieval-Augmented Generation (RAG), can assist in streamlining equipment selection in ramp-up planning. We propose a factual-driven copilot integrating LLMs with structured and semi-structured knowledge retrieval for three component types (robots, feeders and vision systems), providing a guided and traceable state-machine process for decision-making in automation equipment selection. The system was demonstrated to an industrial partner, who tested it on three internal use cases. Their feedback affirmed its capability to provide logical and actionable recommendations for automation equipment. More specifically, among 22 equipment prompts analyzed, 19 involved selecting the correct equipment while considering most requirements, and in 6 cases all requirements were fully met.
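The guided, traceable state-machine flow described above can be sketched as a fixed walk through the three component types, retrieving candidates for each before advancing and logging every choice. The catalog, state names, and selection function are illustrative assumptions, not the authors' system or API.

```python
# Fixed traversal order over the three component types, ending in "done".
STATES = ["robot", "feeder", "vision_system", "done"]

# Stand-in for the structured/semi-structured knowledge base behind RAG.
CATALOG = {
    "robot": ["6-axis arm", "SCARA"],
    "feeder": ["bowl feeder"],
    "vision_system": ["2D camera"],
}

def run_selection(pick=lambda options: options[0]):
    """Walk the state machine, selecting one candidate per component type.

    pick: selection policy over retrieved candidates (here: take the first;
    in the real system an LLM would weigh requirements against candidates).
    """
    trace, state = [], STATES[0]
    while state != "done":
        choice = pick(CATALOG[state])  # retrieval + selection step
        trace.append((state, choice))  # traceability log of each decision
        state = STATES[STATES.index(state) + 1]
    return trace

trace = run_selection()
```

Keeping the per-state trace is what makes the process auditable: an engineer can see which candidates were considered at each step rather than receiving a single opaque recommendation.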
The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model
Chen, Jiawei, Chen, Wentao, Su, Jing, Xu, Jingjing, Lin, Hongyu, Ren, Mengjie, Lu, Yaojie, Han, Xianpei, Sun, Le
Large language models (LLMs) have shown significant multilingual capabilities. However, the mechanisms underlying the development of these capabilities during pre-training are not well understood. In this paper, we use code LLMs as an experimental platform to explore the evolution of multilingual capabilities in LLMs during the pre-training process. Based on our observations, we propose the Babel Tower Hypothesis, which describes the entire process of LLMs acquiring new language capabilities. During the learning process, multiple languages initially share a single knowledge system dominated by the primary language and gradually develop language-specific knowledge systems. Experimental results show that the internal state changes of the LLM are consistent with our Babel Tower Hypothesis. Building on these insights, we propose a novel method to construct an optimized pre-training corpus for multilingual code LLMs, which significantly outperforms LLMs trained on the original corpus. The proposed Babel Tower Hypothesis provides new insights into designing pre-training data distributions to achieve optimal multilingual capabilities in LLMs.

A united human race speaking a single language migrates to Shinar, where they agree to build a great city with a tower that would reach the sky. Yahweh, observing these efforts and remarking on humanity's power in unity, confounds their speech so that they can no longer understand each other and scatters them around the world, leaving the city unfinished.
Dismantle the knowledge systems that enable genocide
When a book titled Terrorism: A Very Short Introduction, written by the British professor and historian Charles Townshend, was found by police near the pro-Palestine student encampment at Columbia University, it was held up by New York Police Department (NYPD) Deputy Commissioner Kaz Daughtry as evidence of some kind of foreign, radicalising influence on student activism. Apparently, for Daughtry, reading a book on terrorism is evidence of radicalisation. Knowing about terrorism makes you at risk of committing terrorism. Finding a book near a student encampment confirms that pro-Palestine solidarity is linked to terrorism. What Daughtry was arguably trying to do was taint Palestine activism on college campuses across the United States by association with terrorism.
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process
Xiao, Tong, Liu, Jiayu, Huang, Zhenya, Wu, Jinze, Sha, Jing, Wang, Shijin, Chen, Enhong
Geometry Problem Solving (GPS), a classic and challenging math problem, has attracted much attention in recent years. It requires a solver to comprehensively understand both text and diagram, master essential geometry knowledge, and appropriately apply it in reasoning. However, existing works follow a paradigm of neural machine translation and focus only on enhancing the capability of encoders, which neglects the essential characteristics of human geometry reasoning. In this paper, inspired by dual-process theory, we propose a Dual-Reasoning Geometry Solver (DualGeoSolver) to simulate the dual-reasoning process of humans for GPS. Specifically, we construct two systems in DualGeoSolver, namely Knowledge System and Inference System. Knowledge System controls an implicit reasoning process, which is responsible for providing diagram information and geometry knowledge according to a step-wise reasoning goal generated by Inference System. Inference System conducts an explicit reasoning process, which specifies the goal in each reasoning step and applies the knowledge to generate program tokens for resolving it. The two systems carry out the above process iteratively, which behaves more in line with human cognition. We conduct extensive experiments on two benchmark datasets, GeoQA and GeoQA+. The results demonstrate the superiority of DualGeoSolver in both solving accuracy and robustness, which stems from explicitly modeling the human reasoning process and knowledge application.
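The iterative two-system loop described above can be sketched as: the Inference System proposes the next reasoning goal, the Knowledge System returns the relevant fact, and a program token is emitted, repeating until no goal remains. The goal sequence, knowledge lookup, and token format below are toy stand-ins, not the paper's actual components.

```python
# Assumed mapping from a step-wise reasoning goal to a geometry fact,
# standing in for the paper's Knowledge System.
KNOWLEDGE = {
    "find_angle": "angles in a triangle sum to 180",
    "find_side": "Pythagorean theorem",
}

def inference_system(step):
    """Explicit reasoning: specify the goal for this step (None = done)."""
    goals = ["find_angle", "find_side"]  # toy fixed plan for one problem
    return goals[step] if step < len(goals) else None

def knowledge_system(goal):
    """Implicit reasoning: supply the knowledge the current goal needs."""
    return KNOWLEDGE[goal]

def solve():
    program, step = [], 0
    while (goal := inference_system(step)) is not None:
        fact = knowledge_system(goal)        # knowledge retrieval for the goal
        program.append(f"apply({fact})")     # emit a program token
        step += 1
    return program

tokens = solve()
```

Separating goal-setting from knowledge retrieval is the design point: each emitted token is grounded in a fact fetched for that specific step, rather than decoded in one pass from the encoded problem.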
Through the Looking-Glass: Transparency Implications and Challenges in Enterprise AI Knowledge Systems
Cortiñas-Lorenzo, Karina, Lindley, Siân, Larsen-Ledet, Ida, Mitra, Bhaskar
Knowledge can't be disentangled from people. As AI knowledge systems mine vast volumes of work-related data, the knowledge that's being extracted and surfaced is intrinsically linked to the people who create and use it. When these systems get embedded in organizational settings, the information that is brought to the foreground and the information that's pushed to the periphery can influence how individuals see each other and how they see themselves at work. In this paper, we present the looking-glass metaphor and use it to conceptualize AI knowledge systems as systems that reflect and distort, expanding our view on transparency requirements, implications and challenges. We formulate transparency as a key mediator in shaping different ways of seeing, including seeing into the system, which unveils its capabilities, limitations and behavior, and seeing through the system, which shapes workers' perceptions of their own contributions and others within the organization. Recognizing the sociotechnical nature of these systems, we identify three transparency dimensions necessary to realize the value of AI knowledge systems, namely system transparency, procedural transparency and transparency of outcomes. We discuss key challenges hindering the implementation of these forms of transparency, bringing to light the wider sociotechnical gap and highlighting directions for future Computer-supported Cooperative Work (CSCW) research.